 multi-class classification



H-Consistency Bounds: Characterization and Extensions

Neural Information Processing Systems

A series of recent publications by Awasthi, Mao, Mohri, and Zhong [2022b] have introduced the key notion of H-consistency bounds for surrogate loss functions. These are upper bounds on the zero-one estimation error of any predictor in a hypothesis set, expressed in terms of its surrogate loss estimation error. They are both non-asymptotic and hypothesis set-specific and thus stronger and more informative than Bayes-consistency. However, determining if they hold and deriving these bounds have required a specific proof and analysis for each surrogate loss. Can we derive more general tools and characterizations?
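As a reading aid, the general shape of such a bound can be sketched as below. The notation is an assumption for illustration and is not taken from the abstract: $\mathcal{R}_{\ell}(h)$ denotes the expected loss of a predictor $h$ under a loss $\ell$, $\mathcal{R}^*_{\ell}(\mathcal{H})$ is its infimum over the hypothesis set $\mathcal{H}$, and $f$ is a non-decreasing function. The bounds in the cited papers typically carry additional terms (minimizability gaps) that this simplified form omits.

```latex
\forall h \in \mathcal{H}: \quad
\mathcal{R}_{\ell_{0-1}}(h) - \mathcal{R}^*_{\ell_{0-1}}(\mathcal{H})
\;\le\;
f\bigl( \mathcal{R}_{\ell}(h) - \mathcal{R}^*_{\ell}(\mathcal{H}) \bigr)
```

The left-hand side is the zero-one estimation error and the argument of $f$ is the surrogate estimation error, both measured relative to the best predictor in $\mathcal{H}$ rather than over all measurable functions.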


A Label model and illustrations

Neural Information Processing Systems

A.1 Majority Voting. Majority Voting (MV) is the most intuitive algorithm for aggregating LFs' annotations. We omit this case for simplicity. A.3 Snorkel MeTaL. The parameters µ of Snorkel MeTaL [31] are given by … By Bayes' theorem we have: p_µ(y = c, λ = m) = p_µ(λ = m | y = c) p(y = c) = … Consider a label model g(L(x), x) ∈ F in an arbitrary functional class F, e.g., neural networks, with an additional dependency on the data feature x. We can still approximate such a complicated function with an identity function-based label model g_{W(x)}(L(x)) similar to the aforementioned one, except that W(x): X → R^{M×(C+1)×C} is a similarly complicated function, e.g., a neural network, that maps each data point x ∈ X to a unique label model parameter W(x). We leave the exploration of more complicated forms of label models to future work. B.1 Case 1: identity function. We define the loss with reweighted samples as … Instead of employing the decomposed loss function, we introduce a more general influence estimation method, weight-moving influence, which gets rid of the loss decomposition and approximation and is agnostic to the choice of the σ(·) function.
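The Majority Voting aggregation mentioned in A.1 can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `majority_vote`, the use of `-1` for an abstaining labeling function, and the tie-breaking rule (smallest class index wins) are all assumptions made for this example.

```python
from collections import Counter

ABSTAIN = -1  # assumed convention: a labeling function (LF) that does not vote


def majority_vote(votes, abstain=ABSTAIN):
    """Aggregate the votes of several LFs for one data point.

    votes: iterable of integer class labels, with `abstain` marking no vote.
    Returns the most frequent non-abstain label; ties go to the smallest
    label (an assumed rule), and `abstain` is returned if no LF voted.
    """
    counts = Counter(v for v in votes if v != abstain)
    if not counts:
        return abstain
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)


def aggregate(label_matrix):
    """Apply majority voting row by row to a list of per-example vote lists."""
    return [majority_vote(row) for row in label_matrix]
```

For instance, `aggregate([[0, 1, 1, -1], [-1, -1, -1, -1]])` gives one aggregated label per row, with the second row left as abstain because no labeling function voted.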




A Generalised Exponentiated Gradient Approach to Enhance Fairness in Binary and Multi-class Classification Tasks

arXiv.org Machine Learning

The widespread use of AI and ML models in sensitive areas raises significant concerns about fairness. While the research community has introduced various methods for bias mitigation in binary classification tasks, the issue remains under-explored in multi-class classification settings. To address this limitation, in this paper, we first formulate the problem of fair learning in multi-class classification as a multi-objective problem between effectiveness (i.e., prediction correctness) and multiple linear fairness constraints. Next, we propose a Generalised Exponentiated Gradient (GEG) algorithm to solve this task. GEG is an in-processing algorithm that enhances fairness in binary and multi-class classification settings under multiple fairness definitions. We conduct an extensive empirical evaluation of GEG against six baselines across seven multi-class and three binary datasets, using four widely adopted effectiveness metrics and three fairness definitions. GEG outperforms existing baselines, with fairness improvements of up to 92% and a decrease in accuracy of up to 14%.


A Universal Growth Rate for Learning with Smooth Surrogate Losses

Neural Information Processing Systems

This paper presents a comprehensive analysis of the growth rate of $H$-consistency bounds (and excess error bounds) for various surrogate losses used in classification. We prove a square-root growth rate near zero for smooth margin-based surrogate losses in binary classification, providing both upper and lower bounds under mild assumptions.
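Read informally, a square-root growth rate near zero says that the function relating the surrogate estimation error $t$ to the target estimation error behaves like $\sqrt{t}$ for small $t$. A sketch in assumed notation follows; the symbol $\Gamma$ for that function and the constants $c$, $C$, $t_0$ are illustrative and do not appear in the abstract.

```latex
\exists\, c, C > 0,\ t_0 > 0 : \quad
c \sqrt{t} \;\le\; \Gamma(t) \;\le\; C \sqrt{t}
\qquad \text{for all } t \in [0, t_0]
```

The two-sided form reflects the abstract's statement that both upper and lower bounds are provided.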



Multi-Class Learning: From Theory to Algorithm

Neural Information Processing Systems

Moreover, the proposed multi-class kernel learning algorithms have statistical guarantees and fast convergence rates. Experimental results on many benchmark datasets show that our proposed methods can significantly outperform the existing multi-class classification methods. The major contributions of this paper include: 1) A new local Rademacher complexity-based bound with a fast convergence rate for multi-class classification is established. Existing works [16, 27] on multi-class classifiers with Rademacher complexity do not take into account couplings among different classes.